In [17]:
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
import numpy as np
In [18]:
pd.read_csv(r'C:\Users\ranjith valthaje\Project_uber\uber-raw-data-janjune-15.csv', encoding='utf-8')
Out[18]:
Dispatching_base_num Pickup_date Affiliated_base_num locationID
0 B02617 2015-05-17 09:47:00 B02617 141
1 B02617 2015-05-17 09:47:00 B02617 65
2 B02617 2015-05-17 09:47:00 B02617 100
3 B02617 2015-05-17 09:47:00 B02774 80
4 B02617 2015-05-17 09:47:00 B02617 90
... ... ... ... ...
14270474 B02765 2015-05-08 15:43:00 B02765 186
14270475 B02765 2015-05-08 15:43:00 B02765 263
14270476 B02765 2015-05-08 15:43:00 B02765 90
14270477 B02765 2015-05-08 15:44:00 B01899 45
14270478 B02765 2015-05-08 15:44:00 B02682 144

14270479 rows × 4 columns

In [19]:
uber_15 = pd.read_csv(r'C:\Users\ranjith valthaje\Project_uber\uber-raw-data-janjune-15.csv', encoding='utf-8')
In [21]:
uber_15.shape
Out[21]:
(14270479, 4)
In [23]:
uber_15.head(5)
Out[23]:
Dispatching_base_num Pickup_date Affiliated_base_num locationID
0 B02617 2015-05-17 09:47:00 B02617 141
1 B02617 2015-05-17 09:47:00 B02617 65
2 B02617 2015-05-17 09:47:00 B02617 100
3 B02617 2015-05-17 09:47:00 B02774 80
4 B02617 2015-05-17 09:47:00 B02617 90
In [24]:
uber_15.tail(5)
Out[24]:
Dispatching_base_num Pickup_date Affiliated_base_num locationID
14270474 B02765 2015-05-08 15:43:00 B02765 186
14270475 B02765 2015-05-08 15:43:00 B02765 263
14270476 B02765 2015-05-08 15:43:00 B02765 90
14270477 B02765 2015-05-08 15:44:00 B01899 45
14270478 B02765 2015-05-08 15:44:00 B02682 144
In [25]:
uber_15.duplicated().sum()
Out[25]:
898225
In [26]:
uber_15.drop_duplicates(inplace = True)
In [27]:
uber_15.shape
Out[27]:
(13372254, 4)
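The drop above removed 898,225 exact duplicates. A minimal sketch of the same check-then-drop pattern on a toy frame (the column names mirror the real data, but the rows are made up):

```python
import pandas as pd

# toy frame with one exact duplicate row (values are illustrative only)
df = pd.DataFrame({
    'Dispatching_base_num': ['B02617', 'B02617', 'B02765'],
    'locationID': [141, 141, 186],
})

n_dupes = df.duplicated().sum()   # rows that repeat an earlier row exactly
df = df.drop_duplicates()         # keeps the first copy of each duplicate group

print(n_dupes, df.shape[0])       # 1 2
```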

Month-wise pickups¶

In [29]:
uber_15.dtypes
Out[29]:
Dispatching_base_num    object
Pickup_date             object
Affiliated_base_num     object
locationID               int64
dtype: object
In [30]:
pd.to_datetime(uber_15['Pickup_date'], format = '%Y-%m-%d %H:%M:%S')
Out[30]:
0          2015-05-17 09:47:00
1          2015-05-17 09:47:00
2          2015-05-17 09:47:00
3          2015-05-17 09:47:00
4          2015-05-17 09:47:00
                   ...        
14270474   2015-05-08 15:43:00
14270475   2015-05-08 15:43:00
14270476   2015-05-08 15:43:00
14270477   2015-05-08 15:44:00
14270478   2015-05-08 15:44:00
Name: Pickup_date, Length: 13372254, dtype: datetime64[ns]
In [31]:
uber_15_Pickup_date = pd.to_datetime(uber_15['Pickup_date'], format = '%Y-%m-%d %H:%M:%S')
In [32]:
uber_15_Pickup_date.dtype
Out[32]:
dtype('<M8[ns]')
In [34]:
uber_15_Pickup_date.dt.month
Out[34]:
0           5
1           5
2           5
3           5
4           5
           ..
14270474    5
14270475    5
14270476    5
14270477    5
14270478    5
Name: Pickup_date, Length: 13372254, dtype: int64
In [35]:
uber_15_month = uber_15_Pickup_date.dt.month
In [37]:
uber_15_month.value_counts()
Out[37]:
6    2571771
5    2483980
2    2222189
4    2112705
3    2062639
1    1918970
Name: Pickup_date, dtype: int64
In [38]:
uber_15_month.value_counts().plot(kind = 'bar')
Out[38]:
<AxesSubplot:>
In [39]:
uber_15_month.value_counts().plot(kind = 'bar', figsize = (10,5))
Out[39]:
<AxesSubplot:>
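Note that `value_counts()` sorts by count, so the bars above run from busiest month to quietest rather than January to June; chaining `sort_index()` restores calendar order. A small sketch on made-up counts:

```python
import pandas as pd

# illustrative month -> pickup-count series (index is the month number)
counts = pd.Series({6: 2571771, 5: 2483980, 1: 1918970})

ordered = counts.sort_index()          # calendar order instead of count order
print(ordered.index.tolist())          # [1, 5, 6]

# ordered.plot(kind='bar', figsize=(10, 5))   # same bar chart, months in order
```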

Total trips for each month and each weekday¶

In [40]:
uber_15['weekday'] = uber_15_Pickup_date.dt.day_name()
uber_15['day'] = uber_15_Pickup_date.dt.day
uber_15['hour'] = uber_15_Pickup_date.dt.hour
uber_15['month'] = uber_15_Pickup_date.dt.month
uber_15['minute'] = uber_15_Pickup_date.dt.minute
In [41]:
uber_15.head(5)
Out[41]:
Dispatching_base_num Pickup_date Affiliated_base_num locationID weekday day hour month minute
0 B02617 2015-05-17 09:47:00 B02617 141 Sunday 17 9 5 47
1 B02617 2015-05-17 09:47:00 B02617 65 Sunday 17 9 5 47
2 B02617 2015-05-17 09:47:00 B02617 100 Sunday 17 9 5 47
3 B02617 2015-05-17 09:47:00 B02774 80 Sunday 17 9 5 47
4 B02617 2015-05-17 09:47:00 B02617 90 Sunday 17 9 5 47
In [42]:
uber_15.groupby(['month', 'weekday']).size()
Out[42]:
month  weekday  
1      Friday       339285
       Monday       190606
       Saturday     386049
       Sunday       230487
       Thursday     330319
       Tuesday      196574
       Wednesday    245650
2      Friday       373550
       Monday       274948
       Saturday     368311
       Sunday       296130
       Thursday     335603
       Tuesday      287260
       Wednesday    286387
3      Friday       309631
       Monday       269931
       Saturday     314785
       Sunday       313865
       Thursday     277026
       Tuesday      320634
       Wednesday    256767
4      Friday       315002
       Monday       238429
       Saturday     324545
       Sunday       273560
       Thursday     372522
       Tuesday      250632
       Wednesday    338015
5      Friday       430134
       Monday       255501
       Saturday     464298
       Sunday       390391
       Thursday     337607
       Tuesday      290004
       Wednesday    316045
6      Friday       371225
       Monday       375312
       Saturday     399377
       Sunday       334434
       Thursday     357782
       Tuesday      405500
       Wednesday    328141
dtype: int64
In [70]:
type(uber_15.groupby(['month', 'weekday']).size())
Out[70]:
pandas.core.series.Series
In [71]:
uber_15.groupby(['month', 'weekday'], as_index = False).size()
Out[71]:
month weekday size
0 1 Friday 339285
1 1 Monday 190606
2 1 Saturday 386049
3 1 Sunday 230487
4 1 Thursday 330319
5 1 Tuesday 196574
6 1 Wednesday 245650
7 2 Friday 373550
8 2 Monday 274948
9 2 Saturday 368311
10 2 Sunday 296130
11 2 Thursday 335603
12 2 Tuesday 287260
13 2 Wednesday 286387
14 3 Friday 309631
15 3 Monday 269931
16 3 Saturday 314785
17 3 Sunday 313865
18 3 Thursday 277026
19 3 Tuesday 320634
20 3 Wednesday 256767
21 4 Friday 315002
22 4 Monday 238429
23 4 Saturday 324545
24 4 Sunday 273560
25 4 Thursday 372522
26 4 Tuesday 250632
27 4 Wednesday 338015
28 5 Friday 430134
29 5 Monday 255501
30 5 Saturday 464298
31 5 Sunday 390391
32 5 Thursday 337607
33 5 Tuesday 290004
34 5 Wednesday 316045
35 6 Friday 371225
36 6 Monday 375312
37 6 Saturday 399377
38 6 Sunday 334434
39 6 Thursday 357782
40 6 Tuesday 405500
41 6 Wednesday 328141
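`size()` on a plain groupby returns a MultiIndex Series, while `as_index=False` returns the flat frame used for plotting below; calling `reset_index(name='size')` on the Series gives the same result. A sketch of the equivalence on toy data:

```python
import pandas as pd

df = pd.DataFrame({'month': [1, 1, 2], 'weekday': ['Fri', 'Fri', 'Mon']})

s = df.groupby(['month', 'weekday']).size()                      # MultiIndex Series
flat = df.groupby(['month', 'weekday'], as_index=False).size()   # flat DataFrame

# reset_index(name='size') on the Series yields the same flat frame
assert flat.equals(s.reset_index(name='size'))
print(flat.columns.tolist())   # ['month', 'weekday', 'size']
```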
In [43]:
temp = uber_15.groupby(['month', 'weekday'], as_index = False).size()
In [44]:
temp.head()
Out[44]:
month weekday size
0 1 Friday 339285
1 1 Monday 190606
2 1 Saturday 386049
3 1 Sunday 230487
4 1 Thursday 330319
In [45]:
temp['month'].unique()
Out[45]:
array([1, 2, 3, 4, 5, 6], dtype=int64)
In [47]:
dict_month = {1:'Jan', 2:'Feb', 3:'March', 4:'April', 5:'May', 6:'June'}
In [48]:
temp['month'].map(dict_month)
Out[48]:
0       Jan
1       Jan
2       Jan
3       Jan
4       Jan
5       Jan
6       Jan
7       Feb
8       Feb
9       Feb
10      Feb
11      Feb
12      Feb
13      Feb
14    March
15    March
16    March
17    March
18    March
19    March
20    March
21    April
22    April
23    April
24    April
25    April
26    April
27    April
28      May
29      May
30      May
31      May
32      May
33      May
34      May
35     June
36     June
37     June
38     June
39     June
40     June
41     June
Name: month, dtype: object
In [49]:
temp_month = temp['month'].map(dict_month)
In [51]:
plt.figure(figsize = (12,8))
sns.barplot(x ='month', y = 'size', hue = 'weekday', data = temp)
Out[51]:
<AxesSubplot:xlabel='month', ylabel='size'>
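One caveat: `temp_month` above is never written back into `temp`, so the chart's x-axis shows month numbers and the weekday legend is alphabetical. A hedged sketch of tidying both before the `sns.barplot` call (the counts are copied from the table above, trimmed to two months):

```python
import pandas as pd

temp = pd.DataFrame({
    'month':   [1, 1, 2, 2],
    'weekday': ['Monday', 'Friday', 'Monday', 'Friday'],
    'size':    [190606, 339285, 274948, 373550],
})

dict_month = {1: 'Jan', 2: 'Feb', 3: 'March', 4: 'April', 5: 'May', 6: 'June'}
temp['month'] = temp['month'].map(dict_month)     # write the names back into temp

# pass an explicit order so the legend runs Monday..Sunday, not alphabetically
day_order = ['Monday', 'Tuesday', 'Wednesday', 'Thursday',
             'Friday', 'Saturday', 'Sunday']
print(temp['month'].unique().tolist())            # ['Jan', 'Feb']
# sns.barplot(x='month', y='size', hue='weekday', data=temp, hue_order=day_order)
```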

Hourly rush in New York City on all days¶

In [52]:
summary = uber_15.groupby(['weekday','hour'], as_index = False).size()
In [53]:
summary
Out[53]:
weekday hour size
0 Friday 0 79879
1 Friday 1 44563
2 Friday 2 27252
3 Friday 3 19076
4 Friday 4 23049
... ... ... ...
163 Wednesday 19 131317
164 Wednesday 20 123490
165 Wednesday 21 120941
166 Wednesday 22 115208
167 Wednesday 23 91631

168 rows × 3 columns

In [54]:
plt.figure(figsize = (12,8))
sns.pointplot(x ='hour', y ='size', hue ='weekday', data = summary)
Out[54]:
<AxesSubplot:xlabel='hour', ylabel='size'>
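The same `summary` frame can also answer the question numerically: for each weekday, which hour sees the most pickups. A sketch using `idxmax` on illustrative data of the same shape:

```python
import pandas as pd

# toy frame shaped like groupby(['weekday', 'hour']).size() above
summary = pd.DataFrame({
    'weekday': ['Friday', 'Friday', 'Monday', 'Monday'],
    'hour':    [8, 18, 8, 18],
    'size':    [70000, 110000, 65000, 90000],
})

# for each weekday, keep the row whose 'size' is largest
peak = summary.loc[summary.groupby('weekday')['size'].idxmax()]
print(peak[['weekday', 'hour']].to_dict('records'))
# [{'weekday': 'Friday', 'hour': 18}, {'weekday': 'Monday', 'hour': 18}]
```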

Which base number has the most active vehicles?¶

In [55]:
pd.read_csv(r'C:\Users\ranjith valthaje\Project_uber\Uber-Jan-Feb-FOIL.csv')
Out[55]:
dispatching_base_number date active_vehicles trips
0 B02512 1/1/2015 190 1132
1 B02765 1/1/2015 225 1765
2 B02764 1/1/2015 3427 29421
3 B02682 1/1/2015 945 7679
4 B02617 1/1/2015 1228 9537
... ... ... ... ...
349 B02764 2/28/2015 3952 39812
350 B02617 2/28/2015 1372 14022
351 B02682 2/28/2015 1386 14472
352 B02512 2/28/2015 230 1803
353 B02765 2/28/2015 747 7753

354 rows × 4 columns

In [56]:
uber_foil = pd.read_csv(r'C:\Users\ranjith valthaje\Project_uber\Uber-Jan-Feb-FOIL.csv')
In [57]:
uber_foil.head()
Out[57]:
dispatching_base_number date active_vehicles trips
0 B02512 1/1/2015 190 1132
1 B02765 1/1/2015 225 1765
2 B02764 1/1/2015 3427 29421
3 B02682 1/1/2015 945 7679
4 B02617 1/1/2015 1228 9537
In [58]:
!pip install chart_studio
!pip install plotly
Requirement already satisfied: chart_studio in c:\python\lib\site-packages (1.1.0)
Requirement already satisfied: six in c:\python\lib\site-packages (from chart_studio) (1.16.0)
Requirement already satisfied: retrying>=1.3.3 in c:\python\lib\site-packages (from chart_studio) (1.3.4)
Requirement already satisfied: plotly in c:\python\lib\site-packages (from chart_studio) (5.9.0)
Requirement already satisfied: requests in c:\python\lib\site-packages (from chart_studio) (2.28.1)
Requirement already satisfied: tenacity>=6.2.0 in c:\python\lib\site-packages (from plotly->chart_studio) (8.0.1)
Requirement already satisfied: charset-normalizer<3,>=2 in c:\python\lib\site-packages (from requests->chart_studio) (2.0.4)
Requirement already satisfied: certifi>=2017.4.17 in c:\python\lib\site-packages (from requests->chart_studio) (2022.9.14)
Requirement already satisfied: idna<4,>=2.5 in c:\python\lib\site-packages (from requests->chart_studio) (3.3)
Requirement already satisfied: urllib3<1.27,>=1.21.1 in c:\python\lib\site-packages (from requests->chart_studio) (1.26.11)
Requirement already satisfied: plotly in c:\python\lib\site-packages (5.9.0)
Requirement already satisfied: tenacity>=6.2.0 in c:\python\lib\site-packages (from plotly) (8.0.1)
In [59]:
import chart_studio.plotly as py
import plotly.graph_objs as go
import plotly.express as px
from plotly.offline import download_plotlyjs, plot, iplot, init_notebook_mode
init_notebook_mode(connected=True)
In [60]:
px.box(x ='dispatching_base_number', y = 'active_vehicles', data_frame = uber_foil)
In [61]:
px.violin(x = 'dispatching_base_number', y = 'active_vehicles', data_frame = uber_foil)
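The box and violin plots show the per-base distributions; the heading's question can also be answered directly by aggregating `active_vehicles` per base. A sketch on a made-up slice of the FOIL data (columns match the real file, values are illustrative):

```python
import pandas as pd

# illustrative stand-in for uber_foil
uber_foil = pd.DataFrame({
    'dispatching_base_number': ['B02512', 'B02764', 'B02764', 'B02512'],
    'active_vehicles': [190, 3427, 3952, 230],
})

totals = uber_foil.groupby('dispatching_base_number')['active_vehicles'].sum()
print(totals.idxmax())   # B02764  (base with the most active vehicles overall)
```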

Collect the data and make it ready for analysis¶

In [62]:
import os
In [63]:
os.listdir(r'C:\Users\ranjith valthaje\Project_uber')
Out[63]:
['other-American_B01362.csv',
 'other-Carmel_B00256.csv',
 'other-Dial7_B00887.csv',
 'other-Diplo_B01196.csv',
 'other-Federal_02216.csv',
 'other-FHV-services_jan-aug-2015.csv',
 'other-Firstclass_B01536.csv',
 'other-Highclass_B01717.csv',
 'other-Lyft_B02510.csv',
 'other-Prestige_B01338.csv',
 'other-Skyline_B00111.csv',
 'Uber-Jan-Feb-FOIL.csv',
 'uber-raw-data-apr14.csv',
 'uber-raw-data-aug14.csv',
 'uber-raw-data-janjune-15.csv',
 'uber-raw-data-jul14.csv',
 'uber-raw-data-jun14.csv',
 'uber-raw-data-may14.csv',
 'uber-raw-data-sep14.csv']
In [71]:
files = os.listdir(r'C:\Users\ranjith valthaje\Project_uber')[-7:]
In [72]:
files
Out[72]:
['uber-raw-data-apr14.csv',
 'uber-raw-data-aug14.csv',
 'uber-raw-data-janjune-15.csv',
 'uber-raw-data-jul14.csv',
 'uber-raw-data-jun14.csv',
 'uber-raw-data-may14.csv',
 'uber-raw-data-sep14.csv']
In [73]:
files.remove('uber-raw-data-janjune-15.csv')
In [74]:
files
Out[74]:
['uber-raw-data-apr14.csv',
 'uber-raw-data-aug14.csv',
 'uber-raw-data-jul14.csv',
 'uber-raw-data-jun14.csv',
 'uber-raw-data-may14.csv',
 'uber-raw-data-sep14.csv']
In [76]:
path = r'C:\Users\ranjith valthaje\Project_uber'

final = pd.DataFrame()


for file in files:
    current_df = pd.read_csv(path + '/' + file, encoding='utf-8')
    final = pd.concat([current_df, final])
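The loop works, but rebuilding `final` on every pass copies the accumulated data each time. The usual idiom is to collect the pieces in a list and concatenate once, with `ignore_index=True` for a clean 0..n-1 index. A sketch with in-memory stand-ins for the per-file frames:

```python
import pandas as pd

# stand-ins for the DataFrames that pd.read_csv returns for each file
frames = [
    pd.DataFrame({'Lat': [40.7316], 'Lon': [-73.9891]}),
    pd.DataFrame({'Lat': [40.7500], 'Lon': [-74.0027]}),
]

final = pd.concat(frames, ignore_index=True)   # one concat, fresh index
print(final.shape)                             # (2, 2)
```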
In [77]:
final.shape
Out[77]:
(4534327, 4)
In [78]:
final.head()
Out[78]:
Date/Time Lat Lon Base
0 9/1/2014 0:01:00 40.2201 -74.0021 B02512
1 9/1/2014 0:01:00 40.7500 -74.0027 B02512
2 9/1/2014 0:03:00 40.7559 -73.9864 B02512
3 9/1/2014 0:06:00 40.7450 -73.9889 B02512
4 9/1/2014 0:11:00 40.8145 -73.9444 B02512
In [79]:
final.duplicated().sum()
Out[79]:
82581
In [80]:
final.drop_duplicates(inplace = True)
In [81]:
final.shape
Out[81]:
(4451746, 4)

# Calculating the Rush Locations in New York City¶

In [82]:
final.groupby(['Lat', 'Lon']).size()
Out[82]:
Lat      Lon     
39.6569  -74.2258    1
39.6686  -74.1607    1
39.7214  -74.2446    1
39.8416  -74.1512    1
39.9055  -74.0791    1
                    ..
41.3730  -72.9237    1
41.3737  -73.7988    1
41.5016  -72.8987    1
41.5276  -72.7734    1
42.1166  -72.0666    1
Length: 574558, dtype: int64
In [83]:
final.groupby(['Lat', 'Lon'],as_index = False).size()
Out[83]:
Lat Lon size
0 39.6569 -74.2258 1
1 39.6686 -74.1607 1
2 39.7214 -74.2446 1
3 39.8416 -74.1512 1
4 39.9055 -74.0791 1
... ... ... ...
574553 41.3730 -72.9237 1
574554 41.3737 -73.7988 1
574555 41.5016 -72.8987 1
574556 41.5276 -72.7734 1
574557 42.1166 -72.0666 1

574558 rows × 3 columns

In [84]:
rush_uber = final.groupby(['Lat', 'Lon'],as_index = False).size()
In [87]:
#!pip install folium
In [88]:
import folium
In [89]:
basemap = folium.Map()
In [90]:
from folium.plugins import HeatMap
In [93]:
HeatMap(rush_uber).add_to(basemap)
Out[93]:
<folium.plugins.heat_map.HeatMap at 0x1556c9d2bb0>
In [92]:
basemap
Out[92]:
(interactive folium heatmap of pickup locations rendered here)
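folium's `HeatMap` accepts an iterable of `[lat, lon, weight]` rows; the grouped frame happens to have its columns in exactly that order, which is why passing it directly worked above. Converting explicitly makes the intent clearer — a sketch on a made-up slice (the folium lines are commented out so the snippet has no extra dependency):

```python
import pandas as pd

# illustrative stand-in for rush_uber: one row per (Lat, Lon) with a pickup count
rush_uber = pd.DataFrame({
    'Lat':  [40.7316, 40.7788],
    'Lon':  [-73.9891, -73.9600],
    'size': [12, 7],
})

# list of [lat, lon, weight] rows (ints become floats via the common dtype)
heat_data = rush_uber[['Lat', 'Lon', 'size']].values.tolist()
print(heat_data[0])   # [40.7316, -73.9891, 12.0]

# from folium.plugins import HeatMap
# HeatMap(heat_data).add_to(basemap)
```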

# Rush by Hour & Day of the Month¶

In [94]:
final.tail(10)
Out[94]:
Date/Time Lat Lon Base
564506 4/30/2014 23:00:00 40.7316 -73.9891 B02764
564507 4/30/2014 23:04:00 40.7267 -73.9937 B02764
564508 4/30/2014 23:05:00 40.7788 -73.9600 B02764
564509 4/30/2014 23:15:00 40.7420 -74.0037 B02764
564510 4/30/2014 23:18:00 40.7514 -74.0066 B02764
564511 4/30/2014 23:22:00 40.7640 -73.9744 B02764
564512 4/30/2014 23:26:00 40.7629 -73.9672 B02764
564513 4/30/2014 23:31:00 40.7443 -73.9889 B02764
564514 4/30/2014 23:32:00 40.6756 -73.9405 B02764
564515 4/30/2014 23:48:00 40.6880 -73.9608 B02764
In [100]:
final['Date/Time'] = pd.to_datetime(final['Date/Time'], format = '%m/%d/%Y %H:%M:%S')
In [104]:
final['Weekday'] = final['Date/Time'].dt.day   # note: .dt.day gives the day of the month (1-31), not the weekday name
final['hour'] = final['Date/Time'].dt.hour
In [105]:
final.head()
Out[105]:
Date/Time Lat Lon Base Weekday hour
0 2014-09-01 00:01:00 40.2201 -74.0021 B02512 1 0
1 2014-09-01 00:01:00 40.7500 -74.0027 B02512 1 0
2 2014-09-01 00:03:00 40.7559 -73.9864 B02512 1 0
3 2014-09-01 00:06:00 40.7450 -73.9889 B02512 1 0
4 2014-09-01 00:11:00 40.8145 -73.9444 B02512 1 0
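Because `.dt.day` returns the day of the month rather than the day of the week, the `Weekday` column above holds calendar days 1-31 — which is why the pivot below has 31 rows. `.dt.day_name()` would give the weekday name instead. A quick sketch of the difference:

```python
import pandas as pd

ts = pd.to_datetime(pd.Series(['2014-09-01 00:01:00']))
print(ts.dt.day[0])         # 1       (day of the month)
print(ts.dt.day_name()[0])  # Monday  (2014-09-01 was a Monday)
```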
In [109]:
final.groupby(['Weekday', 'hour']).size().unstack()
Out[109]:
hour 0 1 2 3 4 5 6 7 8 9 ... 14 15 16 17 18 19 20 21 22 23
Weekday
1 3178 1944 1256 1308 1429 2126 3664 5380 5292 4617 ... 6933 7910 8633 9511 8604 8001 7315 7803 6268 4050
2 2435 1569 1087 1414 1876 2812 4920 6544 6310 4712 ... 6904 8449 10109 11100 11123 9474 8759 8357 6998 5160
3 3354 2142 1407 1467 1550 2387 4241 5663 5386 4657 ... 7226 8850 10314 10491 11239 9599 9026 8531 7142 4686
4 2897 1688 1199 1424 1696 2581 4592 6029 5704 4744 ... 7158 8515 9492 10357 10259 9097 8358 8649 7706 5130
5 2733 1541 1030 1253 1617 2900 4814 6261 6469 5530 ... 6955 8312 9609 10699 10170 9430 9354 9610 8853 6518
6 4537 2864 1864 1555 1551 2162 3642 4766 4942 4401 ... 7235 8612 9444 9929 9263 8405 8117 8567 7852 5946
7 3645 2296 1507 1597 1763 2422 4102 5575 5376 4639 ... 7276 8474 10393 11013 10573 9472 8691 8525 7194 4801
8 2830 1646 1123 1483 1889 3224 5431 7361 7357 5703 ... 7240 8775 9851 10673 9687 8796 8604 8367 6795 4256
9 2657 1724 1222 1480 1871 3168 5802 7592 7519 5895 ... 7877 9220 10270 11910 11449 9804 8909 8665 7499 5203
10 3296 2126 1464 1434 1591 2594 4664 6046 6158 5072 ... 7612 9578 11045 11875 10934 9613 9687 9240 7766 5496
11 3036 1665 1095 1424 1842 2520 4954 6876 6871 5396 ... 7503 8920 10125 10898 10361 9327 8824 8730 7771 5360
12 3227 2147 1393 1362 1757 2710 4576 6250 6231 5177 ... 7743 9390 10734 11713 12216 10393 9965 10310 9992 7945
13 5408 3509 2262 1832 1705 2327 4196 5685 6060 5631 ... 8200 9264 10534 11826 11450 9921 8705 8423 7363 5936
14 3748 2349 1605 1656 1756 2629 4257 5781 5520 4824 ... 6963 8192 9511 10115 9553 9146 9182 8589 6891 4460
15 2497 1515 1087 1381 1862 2980 5050 6837 6729 5201 ... 7633 8505 10285 11959 11728 11032 10509 9105 7153 4480
16 2547 1585 1119 1395 1818 2966 5558 7517 7495 5958 ... 7597 9290 10804 11773 10855 10924 10142 10374 8094 5380
17 3155 2048 1500 1488 1897 2741 4562 6315 5882 4934 ... 7472 8997 10323 11236 11089 9919 9935 9823 8362 5699
18 3390 2135 1332 1626 1892 2959 4688 6618 6451 5377 ... 7534 9040 10274 10692 10338 9551 9310 9285 8015 5492
19 3217 2188 1604 1675 1810 2639 4733 6159 6014 5006 ... 7374 8898 9893 10741 10429 9701 10051 10049 9090 6666
20 4475 3190 2100 1858 1618 2143 3584 4900 5083 4765 ... 7462 8630 9448 10046 9272 8592 8614 8703 7787 5907
21 4294 3194 1972 1727 1926 2615 4185 5727 5529 4707 ... 7064 8127 9483 9817 9291 8317 8107 8245 7362 5231
22 2787 1637 1175 1468 1934 3151 5204 6872 6850 5198 ... 7337 9148 10574 10962 9884 8980 8772 8430 6784 4530
23 2546 1580 1136 1429 1957 3132 5204 6890 6436 5177 ... 7575 9309 9980 10341 10823 11347 11447 10347 8637 5577
24 3200 2055 1438 1493 1798 2754 4484 6013 5913 5146 ... 7083 8706 10366 10786 9772 9080 9213 8831 7480 4456
25 2405 1499 1072 1439 1943 2973 5356 7627 7078 5994 ... 7298 8732 9922 10504 10673 9048 8751 9508 8522 6605
26 3810 3065 2046 1806 1730 2337 3776 5172 5071 4808 ... 7269 8815 9885 10697 10867 10122 9820 10441 9486 7593
27 5196 3635 2352 2055 1723 2336 3539 4937 5053 4771 ... 7519 8803 9793 9838 9228 8267 7908 8507 7720 6046
28 4123 2646 1843 1802 1883 2793 4290 5715 5671 5206 ... 7341 8584 9671 9975 9132 8255 8309 7949 6411 4461
29 2678 1827 1409 1678 1948 3056 5213 6852 6695 5481 ... 7630 9249 10105 11113 10411 9301 9270 9114 6992 4323
30 2401 1510 1112 1403 1841 3216 5757 7596 7611 6064 ... 8396 10243 11554 12126 12561 11024 10836 10042 8275 4723
31 2174 1394 1087 919 773 997 1561 2169 2410 2525 ... 4104 5099 5386 5308 5350 4898 4819 5064 5164 3961

31 rows × 24 columns

In [110]:
pivot = final.groupby(['Weekday', 'hour']).size().unstack()
In [112]:
pivot.style.background_gradient()
Out[112]:
hour 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23
Weekday                                                
1 3178 1944 1256 1308 1429 2126 3664 5380 5292 4617 4607 4729 4930 5794 6933 7910 8633 9511 8604 8001 7315 7803 6268 4050
2 2435 1569 1087 1414 1876 2812 4920 6544 6310 4712 4797 4975 5188 5695 6904 8449 10109 11100 11123 9474 8759 8357 6998 5160
3 3354 2142 1407 1467 1550 2387 4241 5663 5386 4657 4788 5065 5384 6093 7226 8850 10314 10491 11239 9599 9026 8531 7142 4686
4 2897 1688 1199 1424 1696 2581 4592 6029 5704 4744 4743 4975 5193 6175 7158 8515 9492 10357 10259 9097 8358 8649 7706 5130
5 2733 1541 1030 1253 1617 2900 4814 6261 6469 5530 5141 5011 5047 5690 6955 8312 9609 10699 10170 9430 9354 9610 8853 6518
6 4537 2864 1864 1555 1551 2162 3642 4766 4942 4401 4801 5174 5426 6258 7235 8612 9444 9929 9263 8405 8117 8567 7852 5946
7 3645 2296 1507 1597 1763 2422 4102 5575 5376 4639 4905 5166 5364 6214 7276 8474 10393 11013 10573 9472 8691 8525 7194 4801
8 2830 1646 1123 1483 1889 3224 5431 7361 7357 5703 5288 5350 5483 6318 7240 8775 9851 10673 9687 8796 8604 8367 6795 4256
9 2657 1724 1222 1480 1871 3168 5802 7592 7519 5895 5406 5443 5496 6419 7877 9220 10270 11910 11449 9804 8909 8665 7499 5203
10 3296 2126 1464 1434 1591 2594 4664 6046 6158 5072 4976 5415 5506 6527 7612 9578 11045 11875 10934 9613 9687 9240 7766 5496
11 3036 1665 1095 1424 1842 2520 4954 6876 6871 5396 5215 5423 5513 6486 7503 8920 10125 10898 10361 9327 8824 8730 7771 5360
12 3227 2147 1393 1362 1757 2710 4576 6250 6231 5177 5157 5319 5570 6448 7743 9390 10734 11713 12216 10393 9965 10310 9992 7945
13 5408 3509 2262 1832 1705 2327 4196 5685 6060 5631 5442 5720 5914 6678 8200 9264 10534 11826 11450 9921 8705 8423 7363 5936
14 3748 2349 1605 1656 1756 2629 4257 5781 5520 4824 4911 5118 5153 5747 6963 8192 9511 10115 9553 9146 9182 8589 6891 4460
15 2497 1515 1087 1381 1862 2980 5050 6837 6729 5201 5347 5517 5503 6997 7633 8505 10285 11959 11728 11032 10509 9105 7153 4480
16 2547 1585 1119 1395 1818 2966 5558 7517 7495 5958 5626 5480 5525 6198 7597 9290 10804 11773 10855 10924 10142 10374 8094 5380
17 3155 2048 1500 1488 1897 2741 4562 6315 5882 4934 5004 5306 5634 6507 7472 8997 10323 11236 11089 9919 9935 9823 8362 5699
18 3390 2135 1332 1626 1892 2959 4688 6618 6451 5377 5150 5487 5490 6383 7534 9040 10274 10692 10338 9551 9310 9285 8015 5492
19 3217 2188 1604 1675 1810 2639 4733 6159 6014 5006 5092 5240 5590 6367 7374 8898 9893 10741 10429 9701 10051 10049 9090 6666
20 4475 3190 2100 1858 1618 2143 3584 4900 5083 4765 5135 5650 5745 6656 7462 8630 9448 10046 9272 8592 8614 8703 7787 5907
21 4294 3194 1972 1727 1926 2615 4185 5727 5529 4707 4911 5212 5465 6085 7064 8127 9483 9817 9291 8317 8107 8245 7362 5231
22 2787 1637 1175 1468 1934 3151 5204 6872 6850 5198 5277 5352 5512 6342 7337 9148 10574 10962 9884 8980 8772 8430 6784 4530
23 2546 1580 1136 1429 1957 3132 5204 6890 6436 5177 5066 5304 5504 6232 7575 9309 9980 10341 10823 11347 11447 10347 8637 5577
24 3200 2055 1438 1493 1798 2754 4484 6013 5913 5146 4947 5311 5229 5974 7083 8706 10366 10786 9772 9080 9213 8831 7480 4456
25 2405 1499 1072 1439 1943 2973 5356 7627 7078 5994 5432 5504 5694 6204 7298 8732 9922 10504 10673 9048 8751 9508 8522 6605
26 3810 3065 2046 1806 1730 2337 3776 5172 5071 4808 5061 5179 5381 6166 7269 8815 9885 10697 10867 10122 9820 10441 9486 7593
27 5196 3635 2352 2055 1723 2336 3539 4937 5053 4771 5198 5732 5839 6820 7519 8803 9793 9838 9228 8267 7908 8507 7720 6046
28 4123 2646 1843 1802 1883 2793 4290 5715 5671 5206 5247 5500 5486 6120 7341 8584 9671 9975 9132 8255 8309 7949 6411 4461
29 2678 1827 1409 1678 1948 3056 5213 6852 6695 5481 5234 5163 5220 6305 7630 9249 10105 11113 10411 9301 9270 9114 6992 4323
30 2401 1510 1112 1403 1841 3216 5757 7596 7611 6064 5987 6090 6423 7249 8396 10243 11554 12126 12561 11024 10836 10042 8275 4723
31 2174 1394 1087 919 773 997 1561 2169 2410 2525 2564 2777 2954 3280 4104 5099 5386 5308 5350 4898 4819 5064 5164 3961

Report: Uber Rides in New York City¶

Weekday vs. Weekend Rides:

The report compares the number of rides taken on weekdays with those taken on weekends, giving insight into how demand varies across the week. Weekday ride counts are higher than weekend counts: on average there are about 10,000 weekday rides versus 8,000 weekend rides, and the majority of rides (roughly 75%) were taken on weekdays, with the remaining 25% on weekends. This helps in understanding the demand patterns for Uber rides in New York City and can help Uber plan its resources accordingly.

Hourly Rush: The report shows the number of rides taken in each hour of the day. On weekdays, demand peaks around 8 AM and again during the evening rush between 5 PM and 8 PM, when people are leaving work; on weekends the rush shifts later, building from midday and staying high into the late evening (roughly 9 PM to midnight). Knowing the peak and off-peak hours helps Uber allocate drivers accordingly and improve service efficiency.

Rush on Weekdays: The report shows the number of rides taken on each day of the week. Across most months, Friday and Saturday see the most pickups, while Monday and Tuesday tend to be the quietest (June is an exception, where Tuesday leads). This helps Uber identify the days with the highest demand, plan resources accordingly, and spot trends or patterns in the data.

Location-Based Rush: The heatmap shows where rides are concentrated across New York City. The heaviest rush appears in Midtown Manhattan and the Financial District, followed by Brooklyn neighborhoods such as Williamsburg, with comparatively lower demand in Queens and the Bronx. This information is useful for drivers who want to position themselves in high-demand areas, and it helps Uber allocate drivers and identify areas where service could be improved.

Overall, the report provides valuable insight into the demand patterns and popular locations for Uber rides in New York City. This information can help both drivers and the company decide when and where to allocate resources to meet demand, improve service efficiency, and deliver a better customer experience.